Genealogical Indexing of Obituaries Using Automatic Processes

Authors

  • Patrick Schone
  • Jake Gehring
Abstract

Because modern obituaries provide rich genealogical information about family members who died within the bounds of “living memory,” family history organizations have recently begun to acquire and index obituaries in vast quantities. Indexing these documents is typically done by human labor. We describe an effort by FamilySearch that leverages various kinds of machine learning, statistical analysis, and rule-based processing to index such documents automatically, without human intervention, at rates thousands of times faster than humans while still achieving high levels of accuracy.


Similar resources

A Two-Stage Split-and-Select Model for Automatic Indexing of Persian Texts

Purpose: Each language has its own problems, so appropriate automatic indexing models must be considered for each language. These models should address the exhaustivity and specificity of indexing. This paper aims to introduce and evaluate a model suited to automatic indexing of Persian. The model proposes to break the text into particles of candidate terms and to c...


Automatic keyword extraction using Latent Dirichlet Allocation topic modeling: Similarity with golden standard and users' evaluation

Purpose: This study investigates automatic keyword extraction from the tables of contents of Persian e-books in the field of science using LDA topic modeling, evaluating the similarity of the extracted keywords to a gold standard and users' views of the model's keywords. Methodology: This is a mixed text-mining research in which LDA topic modeling is used to extract keywords from the table of contents of sci...


Mining Digital Imagery Data for Automatic Linguistic Indexing of Pictures

In this paper, we present a new research direction, automatic linguistic indexing of pictures, for data mining and machine learning researchers. Automatic linguistic indexing of pictures is an imperative but highly challenging problem. In our on-going research, we introduce a statistical modeling approach to this problem. Computer algorithms have been developed to mine numerical features automa...


Automatic video indexing with incremental gallery creation: integration of recognition and knowledge acquisition

A framework for integrating the processes of object recognition and knowledge acquisition is proposed and applied to solve a task of automatic video indexing based on personal appearance events in a video stream. Spatiotemporal segmentation using multiple cues and example-based adaptation of a known person gallery are combined in a prototype system which demonstrated successful results in our p...


Line-of-descent and genealogical processes, and their applications in population genetics models.

A variety of results for genealogical and line-of-descent processes that arise in connection with the theory of some classical selectively neutral population genetics models are reviewed. While some new results and derivations are included, the principal aim is to demonstrate the central importance and simplicity of genealogical Markov chains in this theory. Considerable attention is given to "...




Publication date: 2016